A video conference system includes an endpoint that captures audio and video of participants in a room during, for example, a conference session, and then transmits the audio and video to a conference server or to a “far-end” endpoint. The one or more cameras of a video conference endpoint may be fixed or, if adjustable, somewhat difficult to manipulate. In some instances, during a video conference session, the one or more cameras of the video conference endpoint may not be able to convey to the participants at the far-end endpoint a sufficient contextual understanding of the events and topics of discussion at the video conference endpoint.